How to Interpret SVD Units in Predictive Models?

Author

  • Goutam Chakraborty
Abstract

Recent studies suggest that unstructured data such as customer comments or feedback can enhance the power of existing predictive models. SAS® Text Miner can generate singular value decomposition (SVD) units from text documents; these units are a vector representation of the terms in the documents. When used as additional inputs alongside the existing structured input variables, SVD units often capture the response better. However, SVD units are essentially black-box variables and are not easy to interpret or explain. This is a major obstacle to persuading decision makers in organizations to incorporate these derived text components into their models. In this paper, we demonstrate a new and powerful feature in SAS® Text Miner 12.1 that helps explain the SVD units or the text cluster components, and we discuss two important methods for interpreting them. For this purpose, we used data from a television network company that holds transcripts of call center notes from the three prior calls of each customer. We were able to extract from the call center notes, in the form of Boolean rules, the key terms that contributed to the prediction of customer churn. These rules give an intuitive sense of which sets of terms, occurring in the presence or absence of other sets of terms in the call center notes, may lead to churn. They also provide insight into which customers are at greater risk of churning from the company's services and, more importantly, why.
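To make the idea of SVD units more concrete, the following is a minimal sketch in Python with NumPy. It is not SAS® Text Miner's implementation and does not reproduce the Boolean rule extraction described in the abstract. The toy term counts, the term list, and the helper names `svd_units` and `top_terms` are all hypothetical and chosen only for illustration. The sketch derives one low-dimensional vector per document from a document-by-term count matrix and then lists the terms that load most heavily on each SVD dimension, which is one simple way to get a first reading of what a dimension represents.

```python
import numpy as np

# Hypothetical toy data (not the paper's data): rows are documents, columns are terms.
terms = ["cancel", "bill", "outage", "upgrade", "thanks"]
counts = np.array([
    [2, 1, 0, 0, 0],
    [1, 2, 0, 0, 0],
    [0, 0, 2, 1, 0],
    [0, 0, 1, 2, 1],
], dtype=float)


def svd_units(doc_term, k):
    """Return (document SVD units, term loadings) for the top-k singular directions."""
    u, s, vt = np.linalg.svd(doc_term, full_matrices=False)
    doc_units = u[:, :k] * s[:k]   # one k-dimensional vector per document
    term_loadings = vt[:k, :].T    # one k-dimensional vector per term
    return doc_units, term_loadings


def top_terms(term_loadings, term_names, dim, n=2):
    """List the n terms with the largest absolute loading on one SVD dimension."""
    order = np.argsort(-np.abs(term_loadings[:, dim]))[:n]
    return [term_names[i] for i in order]


doc_units, term_loadings = svd_units(counts, k=2)
for dim in range(2):
    print(f"SVD dimension {dim}: top terms = {top_terms(term_loadings, terms, dim)}")
```

In a predictive model, the columns of `doc_units` would play the role of the additional inputs, while the term loadings only hint at what each column means, which is the interpretability gap the abstract refers to.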

Similar resources

Using game theory approach to interpret stable policies for Iran’s oil and gas common resources conflicts with Iraq and Qatar

Oil and gas as the non-renewable resources are considered very valuable for the countries with petroleum economics. These resources are not only diffused equally around the world, but also they are common in some places which their neighbors often come into conflicts. Consequently, it is vital for those countries to manage their resource utilization. Lately, game theory was applied in conflict ...

A new method for separating and classifying cancerous and non-cancerous DNA sequences using LPC- and SVD-based algorithms

The growing pace of cancer has encouraged researchers to deliberate several aspects of this malignant disease. Genetic-induced nature of cancer, heighten the importance of studying intra-cell components. This paper has been carried out with the aim of making some specific and unique features clear from those long DNA sequences by employing well-established DNA sequence analysis techniques. The ...

Satisfaction Function in Present Undesirable Factors

Data Envelopment Analysis (DEA) is an efficient method to perform evaluation of units. In DEA we try to evaluate units with undesirable factors in input & outputs by satisfaction function, testing some models. On the other hand benefiting this concept, we can identify non-efficient units. Also we can recognize why these units are inefficient and calculate the reason of their inefficiency and ho...

Data Envelopment Analysis with LINGO Modeling for Technical Educational Group of an Organization

Data Envelopment Analysis (DEA) was developed to help compare the relative performance of decision-making units. It is a non-parametric method for performing frontier analysis. It uses linear programming to estimate the efficiency of multiple decision-making units and it is commonly used in production, management and economics [3]. DEA generates an efficiency score between 0 and 1 for each unit...

Bayesian Predictive Modelling: Application to Aircraft Short-Term Conflict Alert System

Bayesian Model Averaging (BMA), computationally feasible using Markov Chain Monte Carlo (MCMC), is a well-known method for reliable estimation of predictive distributions. The use of decision tree (DT) models for the averaging enables experts not only to estimate a predictive posterior but also to interpret models of interest and estimate the importance of predictor factors that are assumed to ...

Publication date: 2014